A generalized characterization of algorithmic probability
An a priori semimeasure (also known as "algorithmic probability" or "the
Solomonoff prior" in the context of inductive inference) is defined as the
transformation, by a given universal monotone Turing machine, of the uniform
measure on the infinite strings. It is shown in this paper that the class of a
priori semimeasures can equivalently be defined as the class of
transformations, by all compatible universal monotone Turing machines, of any
continuous computable measure in place of the uniform measure. Some
consideration is given to possible implications for the prevalent association
of algorithmic probability with certain foundational statistical principles.
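As a reading aid, the definition described in this abstract can be written out roughly as follows. The notation is an assumption introduced here, not taken from the paper: U is a universal monotone Turing machine, U(ω) is the finite or infinite output U produces on the infinite input ω, λ is the uniform measure, μ is a continuous computable measure, and ⪯ is the prefix relation.

```latex
% Standard form: transform the uniform measure \lambda by U.
Q_U(\sigma) \;=\; \lambda\bigl(\{\omega \in \{0,1\}^{\infty} : \sigma \preceq U(\omega)\}\bigr)

% Generalized form suggested by the abstract: replace \lambda by a
% continuous computable measure \mu (symbol Q_U^{\mu} is hypothetical).
Q_U^{\mu}(\sigma) \;=\; \mu\bigl(\{\omega \in \{0,1\}^{\infty} : \sigma \preceq U(\omega)\}\bigr)
```

On this reading, the claim of the abstract is that the class of functions of the first form, ranging over universal U, coincides with the class of functions of the second form, ranging over the universal U compatible with μ.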
Putnam's Diagonal Argument and the Impossibility of a Universal Learning Machine
The diagonalization argument of Putnam (1963) denies the possibility of a universal learning machine. Yet the proposal of Solomonoff (1964) and Levin (1970) promises precisely such a thing. In this paper I discuss how their proposed measure function manages to evade Putnam's diagonalization in one respect, only to fatally fall prey to it in another.
Universal Prediction
In this thesis I investigate the theoretical possibility of a universal method of prediction. A prediction method is universal if it is always able to learn from data: if it is always able to extrapolate given data about past observations to maximally successful predictions about future observations. The context of this investigation is the broader philosophical question into the possibility of a formal specification of inductive or scientific reasoning, a question that also relates to modern-day speculation about a fully automatized data-driven science.
I investigate, in particular, a proposed definition of a universal prediction method that goes back to Solomonoff (1964) and Levin (1970). This definition marks the birth of the theory of Kolmogorov complexity, and has a direct line to the information-theoretic approach in modern machine learning. Solomonoff's work was inspired by Carnap's program of inductive logic, and the more precise definition due to Levin can be seen as an explicit attempt to escape the diagonal argument that Putnam (1963) famously launched against the feasibility of Carnap's program.
The Solomonoff-Levin definition essentially aims at a mixture of all possible prediction algorithms. An alternative interpretation is that the definition formalizes the idea that learning from data is equivalent to compressing data. In this guise, the definition is often presented as an implementation and even as a justification of Occam's razor, the principle that we should look for simple explanations.
The conclusions of my investigation are negative. I show that the Solomonoff-Levin definition fails to unite two necessary conditions to count as a universal prediction method, as turns out to be entailed by Putnam's original argument after all; and I argue that this indeed shows that no definition can. Moreover, I show that the suggested justification of Occam's razor does not work, and I argue that the relevant notion of simplicity as compressibility is already problematic itself.
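The idea of "a mixture of all possible prediction algorithms" mentioned in this abstract can be illustrated with a toy Bayesian mixture over a small, finite pool of predictors. This is only a sketch under strong assumptions: the real definition ranges over an infinite class and is not computable, and the names `mixture_predict`, `always_half`, `mostly_one`, and `mostly_zero` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction

def mixture_predict(predictors, weights, history):
    """Mixture probability that the next bit is 1, given a bit history.

    Each predictor maps a past sequence to its probability that the next
    bit is 1. Each prior weight is multiplied by the likelihood that the
    predictor assigned to the observed history (Bayesian updating).
    """
    posteriors = []
    for predict, weight in zip(predictors, weights):
        likelihood = Fraction(1)
        for t, bit in enumerate(history):
            p_one = predict(history[:t])
            likelihood *= p_one if bit == 1 else 1 - p_one
        posteriors.append(weight * likelihood)
    total = sum(posteriors)
    if total == 0:
        raise ValueError("every predictor assigned probability 0 to the history")
    # Posterior-weighted average of the individual next-bit predictions.
    return sum(q * predict(history)
               for q, predict in zip(posteriors, predictors)) / total

# A hypothetical finite pool standing in for "all prediction algorithms".
always_half = lambda past: Fraction(1, 2)
mostly_one = lambda past: Fraction(9, 10)
mostly_zero = lambda past: Fraction(1, 10)

pool = [always_half, mostly_one, mostly_zero]
prior = [Fraction(1, 3)] * 3

print(mixture_predict(pool, prior, []))            # prediction before any data
print(mixture_predict(pool, prior, [1, 1, 1, 1]))  # prediction after four 1s
```

After a run of 1s, the posterior weight shifts toward the predictor that favored 1s, so the mixture's prediction moves in the same direction.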
The Meta-Inductive Justification of Induction: The Pool of Strategies
This paper poses a challenge to Schurz's proposed meta-inductive justification of induction. It is argued that Schurz's argument requires a notion of optimality that can deal with an expanding pool of prediction strategies.
Universal Prediction
In this dissertation I investigate the theoretical possibility of a universal method of prediction. A prediction method is universal if it is always able to learn what there is to learn from data: if it is always able to extrapolate given data about past observations to maximally successful predictions about future observations. The context of this investigation is the broader philosophical question into the possibility of a formal specification of inductive or scientific reasoning, a question that also touches on modern-day speculation about a fully automatized data-driven science.
I investigate, in particular, a specific mathematical definition of a universal prediction method that goes back to the early days of artificial intelligence and that has a direct line to modern developments in machine learning. This definition essentially aims to combine all possible prediction algorithms. An alternative interpretation is that this definition formalizes the idea that learning from data is equivalent to compressing data. In this guise, the definition is often presented as an implementation and even as a justification of Occam's razor, the principle that we should look for simple explanations.
The conclusions of my investigation are negative. I show that the proposed definition cannot be interpreted as a universal prediction method, as turns out to be exposed by a mathematical argument that it was actually intended to overcome. Moreover, I show that the suggested justification of Occam's razor does not work, and I argue that the relevant notion of simplicity as compressibility is problematic itself.
Solomonoff Prediction and Occam's Razor
Algorithmic information theory gives an idealized notion of compressibility that is often presented as an objective measure of simplicity. It is suggested at times that Solomonoff prediction, or algorithmic information theory in a predictive setting, can deliver an argument to justify Occam's razor. This paper explicates the relevant argument, and, by converting it into a Bayesian framework, reveals why it has no such justificatory force. The supposed simplicity concept is better perceived as a specific inductive assumption, the assumption of effectiveness. It is this assumption that is the characterizing element of Solomonoff prediction, and wherein its philosophical interest lies.
The Meta-Inductive Justification of Induction
I evaluate Schurz's proposed meta-inductive justification of induction, a refinement of Reichenbach's pragmatic justification that rests on results from the machine learning branch of prediction with expert advice.
My conclusion is that the argument, suitably explicated, comes remarkably close to its grand aim: an actual justification of induction. This finding, however, is subject to two main qualifications, and still disregards one important challenge.
The first qualification concerns the empirical success of induction. Even though, I argue, Schurz's argument does not need to spell out what inductive method actually consists in, it does need to postulate that there is something like the inductive or scientific prediction strategy that has so far been *significantly* more successful than alternative approaches. The second qualification concerns the difference between having a justification for inductive method and for sticking with induction *for now*. Schurz's argument can only provide the latter. Finally, the remaining challenge concerns the pool of alternative strategies, and the relevant notion of a meta-inductivist's optimality that features in the analytical step of Schurz's argument. Building on the work done here, I will argue in a follow-up paper that the argument needs a stronger *dynamic* notion of a meta-inductivist's optimality.
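For readers unfamiliar with prediction with expert advice, the following sketch shows the standard exponentially weighted average forecaster over a fixed pool of experts. It is not Schurz's own meta-inductive strategy, only a generic instance of the kind of method that branch of machine learning studies. The function name, the learning rate `eta`, the squared loss, and the two toy experts are all assumptions made for illustration.

```python
import math

def exp_weighted_forecast(expert_preds, outcomes, eta=2.0):
    """Run an exponentially weighted average forecaster.

    expert_preds[t][i] is expert i's prediction in [0, 1] at round t,
    and outcomes[t] is the outcome in [0, 1] revealed after that round.
    Returns (forecasts, expert_losses, forecaster_loss), using squared loss.
    """
    n_experts = len(expert_preds[0])
    expert_losses = [0.0] * n_experts
    forecasts = []
    forecaster_loss = 0.0
    for preds, outcome in zip(expert_preds, outcomes):
        # Weight each expert by exp(-eta * its cumulative loss so far).
        weights = [math.exp(-eta * loss) for loss in expert_losses]
        norm = sum(weights)
        forecast = sum(w * p for w, p in zip(weights, preds)) / norm
        forecasts.append(forecast)
        forecaster_loss += (forecast - outcome) ** 2
        # The outcome is revealed; update every expert's cumulative loss.
        for i, p in enumerate(preds):
            expert_losses[i] += (p - outcome) ** 2
    return forecasts, expert_losses, forecaster_loss

# Two hypothetical experts: one always predicts 1, the other always 0.
rounds = 20
preds = [[1.0, 0.0] for _ in range(rounds)]
outcomes = [1.0] * rounds

forecasts, expert_losses, forecaster_loss = exp_weighted_forecast(preds, outcomes)
print(forecasts[0], forecasts[-1])
print(expert_losses, forecaster_loss)
```

In this toy run the forecaster starts out undecided between the two experts and then quickly follows the one that has been successful so far. The pool of experts is fixed in advance here, which is exactly the assumption that the remaining challenge about the pool of alternative strategies calls into question.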
Putnam's Diagonal Argument and the Impossibility of a Universal Learning Machine
Putnam (1963) construed the aim of Carnap's program of inductive logic as the specification of an "optimum" or "universal" learning machine, and presented a diagonal proof against the very possibility of such a thing. Yet the ideas of Solomonoff (1964) and Levin (1970) lead to a mathematical foundation of precisely those aspects of Carnap's program that Putnam took issue with, and in particular, resurrect the notion of a universal learning machine.
This paper takes up the question whether the Solomonoff-Levin proposal is successful in this respect. I expose the general strategy to evade Putnam's argument, leading to a broader discussion of the outer limits of mechanized Bayesian induction. I argue that this strategy ultimately still succumbs to diagonalization, reinforcing Putnam's impossibility claim.